Introduction

We chose to work with a wine review dataset available on Kaggle. The dataset contains information published on the Wine Enthusiast website. We did not find any app or website offering a clear overview of wine reviews across the world, so we built an interactive map that makes it easy to navigate between countries. For each country, the map shows the average quality, the most common type and the number of wines reviewed. On the right side of the dashboard is the “Wine data explorer”, where you can select one or more varieties and countries to see the relation between quality and price for that selection. The dashboard also lets you explore the wine review data directly: you can search for a specific wine and access its key information, such as the price or the winery. Finally, the price range can be adjusted to find wines that match each user’s willingness to pay.

Data Cleaning

Wine Review Dataset

The original dataset was retrieved from Kaggle and contains the following 14 variables:
- X: Integer variable, indicates the observation number.
- Country: Character variable, indicates the country the wine comes from.
- Description: Character variable, gives information about the wine.
- Designation: Character variable, indicates the vineyard of the winery.
- Points: Integer variable, contains the number of points given to a specific wine. It ranges from 1 to 100, where 100 indicates the best possible wine and 1 the worst.
- Price: Numerical variable, indicates the price (USD) of a bottle of the wine.
- Province: Character variable, indicates the province or state the wine comes from.
- Region 1: Character variable, indicates the area the wine is from.
- Region 2: Character variable, gives more specific information about the region where the grapes grow.
- Taster Name: Character variable, indicates the name of the taster who posted the review on the Wine Enthusiast website.
- Taster Twitter Handle: Character variable, contains the Twitter account of the wine taster.
- Title: Character variable, contains information about the wine such as its name, type and year.
- Variety: Character variable, indicates the type of grapes used for the wine (e.g. “Pinot Gris”).
- Winery: Character variable, indicates the winery.

This dataset contains 129971 observations.

We then deleted the unnecessary variables, i.e. X, Taster Name and Taster Twitter Handle. We also deleted the rows with missing values for the price, the country, the variety or the number of points. Finally, we renamed two countries, United States of America and United Kingdom, so that they match the names used in our second database (i.e. countries).

Thus, this new database contains 12 variables and 101400 observations.
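The cleaning steps above can be sketched roughly as follows in base R. This is only an illustration on a toy data frame, not the code used in the package: the column names (X, country, taster_name, ...) and the raw country labels "US" and "England" are assumptions about how the raw file is laid out.

```r
# Toy stand-in for the raw Kaggle data; column names are assumed.
wine_raw <- data.frame(
  X = 0:4,
  country = c("US", "England", "France", NA, "Italy"),
  points = c(90L, 92L, 88L, 85L, NA),
  price = c(20, 45, NA, 15, 30),
  variety = c("Pinot Gris", "Chardonnay", "Merlot", "Riesling", "Sangiovese"),
  taster_name = c("A", "B", "C", "D", "E"),
  taster_twitter_handle = c("@a", "@b", "@c", "@d", "@e"),
  stringsAsFactors = FALSE
)

# 1. Drop the variables that are not needed.
drop_cols <- c("X", "taster_name", "taster_twitter_handle")
wine <- wine_raw[, !(names(wine_raw) %in% drop_cols)]

# 2. Keep only rows where price, country, variety and points are all present.
required <- c("price", "country", "variety", "points")
wine <- wine[complete.cases(wine[, required]), ]

# 3. Rename countries so they match the countries dataset.
#    The raw labels "US" and "England" are assumptions.
recode <- c("US" = "United States of America", "England" = "United Kingdom")
hit <- wine$country %in% names(recode)
wine$country[hit] <- unname(recode[wine$country[hit]])

print(wine)
```

On the toy data, only the first two rows survive the missing-value filter, and their country labels are replaced by the longer names.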

Lastly, we saved the database as an RDA file so that users can easily load it when using our package.
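As a minimal sketch of this step, the base R functions save() and load() write and restore an RDA file. The object name wine and the file name wine.rda are hypothetical, and the example writes to a temporary directory; in a package, such a file would usually live in the data/ folder.

```r
# Small stand-in for the cleaned data (hypothetical object name).
wine <- data.frame(
  country = c("France", "Italy"),
  points = c(90L, 88L),
  stringsAsFactors = FALSE
)

# Write the object to an RDA file (hypothetical file name).
rda_path <- file.path(tempdir(), "wine.rda")
save(wine, file = rda_path)

# Remove the object, then restore it from the file.
rm(wine)
loaded <- load(rda_path)  # load() returns the names of the restored objects
print(loaded)
```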

Countries Dataset

To obtain the coordinates of the countries of the world, we used the geocountries function from the Geodata data package. This allowed us to create our world map.

Countries and Wine reviews databases

In this part, we first created a variable containing the average number of points, i.e. the average wine quality, for each country. We then joined it to the corresponding countries in the “countries” database. Next, we computed the number of unique countries and varieties in our dataset. Finally, we created new labels containing the most popular wine and the total number of wines for each country. The results are displayed on the map with a color scale: depending on the label chosen in the shiny app, the darkest colors indicate either the countries with the highest average quality or those with the highest number of wines reviewed.
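The aggregation and join described above could look roughly like this in base R. It is a sketch on toy data rather than the package's actual code: the object names, the name column of the countries data and the summary column names are all assumptions.

```r
# Toy stand-ins for the cleaned reviews and the countries data.
wine <- data.frame(
  country = c("France", "France", "France", "Italy", "Italy"),
  variety = c("Merlot", "Merlot", "Chardonnay", "Sangiovese", "Sangiovese"),
  points = c(90, 88, 92, 86, 90),
  stringsAsFactors = FALSE
)
countries <- data.frame(
  name = c("France", "Italy", "Spain"),  # assumed column name
  stringsAsFactors = FALSE
)

# Per-country summaries: average points, number of wines, most common variety.
avg_points <- tapply(wine$points, wine$country, mean)
n_wines <- tapply(wine$points, wine$country, length)
top_variety <- tapply(wine$variety, wine$country,
                      function(v) names(which.max(table(v))))

by_country <- data.frame(
  country = names(avg_points),
  avg_points = as.vector(avg_points),
  n_wines = as.vector(n_wines),
  top_variety = as.vector(top_variety),
  stringsAsFactors = FALSE
)

# Left join onto the countries data; countries without reviews get NA.
world <- merge(countries, by_country,
               by.x = "name", by.y = "country", all.x = TRUE)

# Number of unique countries and varieties in the reviews.
n_countries <- length(unique(wine$country))
n_varieties <- length(unique(wine$variety))

print(world)
```

Keeping all rows of the countries data (all.x = TRUE) means that countries with no reviewed wines remain available for drawing on the map, with missing values for the summaries.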

Example of our Wine Review World Map

This map shows the average wine quality of each country in our database. As explained above, the darker the color, the better the quality. The exact average can be read by hovering the mouse over a country. For instance, the country with the highest average quality is the UK, with an average of 91.55, and the lowest is Ukraine, with an average of 84.07.